Quality of Service of an Asynchronous Crash-Recovery Leader Election Algorithm

Authors

  • Vinícius A. Reis
  • Gustavo M. D. Vieira
Abstract

In asynchronous distributed systems it is very hard to assess whether a process taking part in a computation is operating correctly or has failed. To overcome this problem, distributed algorithms are built on unreliable failure detectors, which capture in an abstract way the timing assumptions necessary to assess the operating status of a process. One particular type of failure detector is a leader election, which indicates a single process that has not failed. The unreliability of these failure detectors means that they can make mistakes; however, if they are to be used in practice, there must be limits to the eventual behavior of these detectors. These limits are defined as the quality of service (QoS) provided by the detector. Many works have tackled the problem of creating failure detectors with predictable QoS, but only for crash-stop processes and synchronous systems. This paper presents and analyzes the behavior of a new leader election algorithm named NFD-L for the asynchronous crash-recovery failure model that is efficient in terms of its use of stable memory and message exchanges.
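To make the idea of a failure detector that doubles as a leader election concrete, here is a minimal Python sketch. It is not the NFD-L algorithm from the paper, whose details are not given in this abstract. It assumes a simple heartbeat scheme with a fixed timeout and elects the lowest-numbered trusted process; the class name `HeartbeatLeaderElector`, its methods, and the timeout value are all hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
class HeartbeatLeaderElector:
    """Illustrative timeout-based detector (hypothetical; not NFD-L)."""

    def __init__(self, processes, timeout):
        self.timeout = timeout
        # Time of the last heartbeat received from each process (None = never).
        self.last_seen = {pid: None for pid in processes}

    def heartbeat(self, pid, now):
        """Record that a heartbeat from `pid` arrived at time `now`."""
        self.last_seen[pid] = now

    def trusted(self, now):
        """Processes whose last heartbeat is at most `timeout` old."""
        return sorted(
            pid
            for pid, seen in self.last_seen.items()
            if seen is not None and now - seen <= self.timeout
        )

    def leader(self, now):
        """Lowest-numbered trusted process, or None if all are suspected."""
        trusted = self.trusted(now)
        return trusted[0] if trusted else None


elector = HeartbeatLeaderElector(processes=[1, 2, 3], timeout=2.0)
for pid in (1, 2, 3):
    elector.heartbeat(pid, now=0.0)
leader_at_start = elector.leader(now=1.0)

# Process 1 goes silent (crash, or merely slow messages) while 2 and 3 continue.
elector.heartbeat(2, now=2.5)
elector.heartbeat(3, now=2.5)
leader_after_silence = elector.leader(now=3.0)

# Process 1 is heard from again (for example after recovering) and is trusted once more.
elector.heartbeat(1, now=3.5)
leader_after_recovery = elector.leader(now=4.0)
```

In this sketch the detector cannot tell a crashed process from a slow one, which is the source of the mistakes the abstract mentions. The timeout trades detection speed against the rate of wrong suspicions, and that trade-off is what a QoS specification bounds.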


Similar resources

Leader Election in Distributed Systems with Crash Failures

Leader election is an important problem in distributed computing. Garcia-Molina's Bully Algorithm is a classic solution to leader election in synchronous systems with crash failures. This paper shows that the Bully Algorithm can be easily adapted for use in asynchronous systems. First, we re-write the Bully Algorithm to use a failure detector, instead of explicit time-outs; this yields a modula...


Designing and Evaluating Fault-tolerant Leader Election Algorithms

Fault-tolerant leader election is a basic building block for dependable distributed computing, since it allows coordinating decisions among replicas such that they remain consistent. Indeed, several fault-tolerant agreement protocols rely on an eventual leader election service. This problem has been initially studied in crash-prone systems, and more recently in other failure scenarios, e.g., cr...


Leader Election in Asynchronous Distributed Systems

In a previous paper, Garcia-Molina specifies the leader election problem for synchronous and asynchronous distributed systems with crash and link failures and gives an elegant algorithm for each type of system. This paper points out a flaw in Garcia-Molina's specification of leader election in asynchronous systems and proposes a new specification.


Optimal Distributed t-Resilient Election in Complete Networks

We study the problem of distributed leader election in an asynchronous complete network, in the presence of faults that occurred prior to the execution of the election algorithm. Failures of this type are encountered, for example, during a recovery from a crash in the network. For a network with n processors, k of which start the algorithm and at most t of which might be faulty, we present an algor...


A Leader Election Protocol for Fault Recovery in Asynchronous Fully-Connected Networks

We introduce a new algorithm for consistent failure detection in asynchronous systems. Informally, consistent failure detection requires processes in a distributed system to distinguish between two different populations: a fault-free population and a faulty one. The major contribution of this paper is in combining ideas from group membership and leader election, in order to have an election prot...



Journal:
  • CoRR

Volume: abs/1704.06302


Publication date: 2017